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Summary 

Multiple imputation is a popular imputation method for general purpose estimation. Ruhin 
(1987) provided an easily applicable formula for the variance estimation of multiple imputation. 
However, the validity of the multiple imputation inference requires the congeniality condition of 
Meng (1994), which is not necessarily satisfied for method of moments estimation. This paper 
presents the asymptotic bias of Rubin’s variance estimator when the method of moments estima¬ 
tor is used as a complete-sample estimator in the multiple imputation procedure. A new variance 
estimator based on over-imputation is proposed to provide asymptotically valid inference for 
method of moments estimation. 

Some key words: Bayesian method; Congeniality; Missing at random; Proper imputation; Survey sampling. 


1. Introduction 

Imputation is often used to handle missing data. For inference, if imputed values are treated 
as if they were observed, variance estimates will generally be underestimates (Ford, 1983). To 
account for the uncertainty due to imputation, Rubin (1987, 1996) proposed multiple imputation 
which creates multiply completed datasets to allow assessment of imputation variability. 

Multiple imputation is motivated in a Bayesian framework; however, its frequentist validity is 
controversial. Rubin (1987) claimed that multiple imputation can provide valid frequentist infer¬ 
ence in various applications (for example, Clogg et ah, 1991). On the other hand, as discussed by 
Fay (1992), Kott (1995), Fay (1996), Binder & Sun (1996), Wang & Robins (1998), Robins & 
Wang (2000), Nielsen (2003), and Kim et al. (2006), the multiple imputation variance estimator 
is not always consistent. 

For multiple imputation inference to be valid, imputations must be proper (Rubin, 1987). A 
sufficient condition is given by Meng (1994), the so-called congeniality condition, imposed on 
both the imputation model and the form of subsequent complete-sample analyses, which is quite 
restrictive for general purpose estimation. Rubin’s variance estimator is otherwise inconsistent. 
Kim (2011) pointed out that multiple imputation that is congenial for mean estimation is not 
necessarily congenial for proportion estimation. Therefore, some common statistical procedures. 
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such as the method of moments estimators, can be incompatible with the multiple imputation 
framework. 

In this paper, we characterize the asymptotic bias of Rubin’s variance estimator when the 
method of moments estimator is used in the complete-sample analysis. We also discuss an alter¬ 
native variance estimator that can provide asymptotically valid inference for method of moments 
estimation. The new variance estimator is compared with Rubin’s variance estimator through two 
limited simulation studies in S5. 


2. Basic Setup 

Suppose that the sample consists of n observations (xi,yi),..., {xn,yn), which is an in¬ 
dependent realization of a random vector {X,Y). For simplicity of presentation, assume that 
y is a scalar outcome variable and Y is a p-dimensional covariate. Suppose that Xi is fully 
observed and y* is not fully observed for all units in the sample. Without loss of generality, as¬ 
sume the first r units of yi are observed and the remaining n — r units of yi are missing. Let 
5i be the response indicator of yi, that is, (5* = 1 if y* is observed and 5* = 0 otherwise. Denote 
yobs = (yi) • • •, yr)^ and Xn = (xi,..., Xn)- We further assume that the missing mechanism 
is missing at random in the sense of Rubin (1976). The parameter of interest is y = E{g{Y)}, 
where y(-) is a known function. For example, if y(y) = y, then y = E{Y)i& the population mean 
of Y, and if y(y) = /(y < 1), then y = pr(y < 1) is the population proportion of Y less than 1. 

Assume that the conditional density f{y \ x) belongs to a parametric class of models indexed 
by 9 such that /(y | x) = f{y \ x; 9) for some 0 G D and the marginal distribution of x is com¬ 
pletely unspecified. To generafe impufed values for missing oufcomes from f{y \ x; 0), we need 
fo esfimafe fhe unknown parameter 9, eifher by likelihood-based mefhods or by Bayesian mefh- 
ods. The mulfiple impufafion procedure employs a Bayesian approach fo deal wifh fhe unknown 
paramefer 9, which unfolds in fhree sfeps: 

Step 1. (Impufafion) Creafe M complefe dafasefs by filling in missing values wifh imputed 
values generafed from fhe posterior predicfive disfribufion. Specifically, fo creafe fhe yfh impufed 
dafasef, firsf generafe from fhe posterior disfribufion p{9 \ Xn, yobs)? and fhen generafe y*^^^ 
from fhe impufafion model /(y | x*; 9*^^^) for each missing y^. 

Step 2. (Analysis) Apply fhe user’s complefe-sample esfimafion procedure fo each impufed 
dafasef. Lef be fhe complefe-sample esfimafor of y = E{g{Y)} applied fo fhe yfh imputed 
dafasef and y(i) be fhe complefe-sample variance esfimafor of y^^\ 

Step 3. (Summarize) Use Rubin’s combining rule fo summarize fhe resulfs from fhe multiply 
imputed dafasefs. The mulfiple impufafion esfimafor of y is yMi = M~^ and Rubin’s 

variance esfimafor is 

)4ii(yMi) = Wm + (i + Bm, (1) 

where Wm = M"! V^) and Bm = {M - l)-i - mi?- 

If fhe mefhod of momenfs esfimafor of y = E{g{Y)} is used in step 2, fhe mulfiple impufafion 
esfimafor of y becomes 

M f ^ n M 'I 

yMi= =n~^ V^g{yi)+ ^ ^y(y*^-^^) i , (2) 

j=l I i=l i=r-\-l j=l j 

where y(^) = n-^ELi siPi) + E?=r+i g(y?^^)}. To derive fhe frequenfisf properfy of yMU 
we rely on fhe Bernsfein-von Mises fheorem (van der Vaarf, 2000; Chapter 10), which claims 



Multiple Imputation 3 

that under regularity eonditions and eonditional on the observed data, the posterior distribu¬ 
tion p{9 I Xn-iVobs) eonverges to a normal distribution with mean 6 and varianee where 
9 is the maximum likelihood estimator of 9 from the observed data and is the inverse of 
the observed Fisher information matrix with Jobs = — I xf, 9)jd9d9"’". As a 

result, assume that E{g{Y) \ Xi;9} is suffieiently smooth in 9, eonditional on the observed 

data, we have plimM^oo = E[E{g{Y) \ Xi;9*} \ Xn,yohs] = E{g{Y) \ 

Xi;9}, where = Bn means An = Bn + Op(l). Therefore, for M —)■ oo, ?7 mi eonverges to 
?7Mi,oo = n~^{Yli=iyi + where m{x;9) = E{g{Y) \ x;9}. The varianee 

estimation of r/Mi.oo needs to appropriately aeeount for the uneertainty assoeiated with the es¬ 
timate of 9, whieh is usually done using linearization methods if the imputation models are 
known (Robins & Wang, 2000; Kim & Rao, 2009). In the multiple imputation proeedure, this 
is eharaeterized in the variability between the multiply imputed datasets without referring to the 
imputation models. However, Rubin’s varianee estimator (1) requires restrietive eonditions for 
valid inferenee, whieh we diseuss in the next seetion. 


3. Main Result 

Rubin’s varianee estimator is based on the following deeomposition, 

var(57Mi) = var(57n) -h var(57Mi - Vn) + 2cov(57mi -fin,fin), (3) 

where fjn is the eomplete-sample estimator of g. Basieally, in Rubin’s varianee estimator (1), Wm 
estimates the first term of (3) and (1 -|- M~^)Bm estimates the seeond term of (3). In partieular, 
Kim et al. (2006) proved that -|- M~^)Bm} — var(57Mi — yn) for a fairly general elass of 
estimators. Thus, if the eomplete-sample varianee estimator satisfies fhe eondifion E{V^^'l) = 
vai{fjn) for y = 1,..., M, fhe bias of Rubin’s varianee estimator is 

bias(14ii) = -2 cov(7?mi - fjn-.yn)- (4) 

Rubin’s varianee estimator is asympfofieally unbiased if cov(77mi — VnjVn) — 0, whieh is 
ealled fhe eongenialify eondifion by Meng (1994). However, fhe eongenialify eondifion does 
nof hold for some eommon esfimafors sueh as fhe mefhod of momenfs esfimafors. Theorem 1 
gives fhis asympfolie bias of Rubin’s varianee esfimafor for M —)■ oo, wifh fhe proof ouflined in 
fhe online supplemenfary maferial. 

Theorem 1. Let fjn = X]r=i 9(2/*) method of moments estimator ofy = E{g(Y)} 

under complete response. Assume that EiV^^'i) = Ysxifjn) holds for j = 1,... , M. Then for 
M —>• oo, the bias of Rubin’s variance estimator is 

bias(l/Mi) = 2n"^(l - p) {E [\ai{g{Y) | X} | 5 = 0] - , (5) 

where p = r/n, Zq = —E{d‘^ log f{Y \ X;9)/d9d9'^}, m{x;9) = E{g{Y) \ x;9}, mo{x) = 
dm{x]9)/d9, fnofi = E{m 0 {X) | 5 = 0}, andfne^i = E{m 0 {X) | (5 = 1}. 

Remark 1. Under missing eomplefely al random, fhe bias in (5) simplifies to 

bias(14ii) = 2p(l - p){var(77^,MME) - var(77^,MLE)}, (6) 

where fjr^MME = r~^ ELi dihi) and 7?r,MLE = XlLi E{g{Y) \ xp, 9}, beeause 

var(7?^,MME) = r~^Y&x{g{Y)} = r~^Y&x[E{g{Y) \ X}] + r~^E[vax{g{Y) \ X}], 
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and 

var(?7r,MLE) = r~^Yai[E{g{Y) \ X}] + 

where me = E{me{X)}. Result (6) explieitly shows that Rubin’s varianee estimator is unbiased 
if and only if the method of moments estimator is as effieient as the maximum likelihood esti¬ 
mator, that is, var(57r,MME) — var(57^ mle)- Otherwise, Rubin’s varianee estimator is positively 
biased. 

Remark 2. Under missing at random, the bias of Rubin’s variance estimator can be zero, posi¬ 
tive or negative. Consider a simple linear regression model Y = -|- e, where e ~ X(0, ct^). 

For g{Y) = Y, if X contains 1, then the method of moments estimator n~^ Vi iden¬ 
tical to the maximum likelihood estimator n~^ /S being the maximum likeli¬ 

hood estimator of /3 under complete response. By Theorem 1, let Eq{-) = E{- \ 5 = 0) and 
Ei{-) = E{- I (5 = 1), the bias of Rubin’s variance estimator in (5) is bias(UMi) — 2n“^(l — 
p)a^{l — Eo{X)'^Ei{XX^)~^Ei{X)} = 0, by direct calculation considering that X contains 
1. This is consistent with the theory in Wang & Robins (1998) and Nielsen (2003). Now con¬ 
sider a simple linear regression model which contains one covariate X and no intercept, then 
the method of moments estimator is strictly less efficient than the maximum likelihood estimator 
(Matloff, 1981). The bias of Rubin’s variance estimator is 

bias(UMi) = 2n-\l - p)a^E^{X^)-\E^{X^) - Eo{XfE^{X)}, (7) 

which can be zero, positive or negative depending on the information of X in the respondent and 
non-respondent groups. See the first simulation study in §5. 


4. Alternative Variance Estimation 

In this section, we consider an alternative variance estimation method that leads to an unbi¬ 
ased variance estimator for multiple imputation regardless of whether the method of moments 
estimator or the maximum likelihood estimator is used as the complete-sample estimator in 
the multiple imputation procedure. We first decompose the multiple imputation estimator as, 
Vmi = i?Mi,oo + iVMi — i?Mi,oo)- The two terms are uncorrelated using the law of total covari¬ 
ance and the fact that ?7 mi,oo is the conditional expectation of fjMi, conditional on the observed 
data. Therefore, we have 


var(7?Ml) = var(7?Ml,oo) + var(57Ml - i?Ml,oo)- 


( 8 ) 


Note that var(57Mi — i?Mi,oo) can be estimated by M~^Bm (Kim et ah, 2006; Lemma 2). We now 
focus on estimating var(?7Mi,oo) in (8). For simplicity of presentation, all details of derivation are 
to be found in supplementary material. We show that the variance of ?7mi,oo is a sum of two terms, 

var (57 mi,oo) = n~^Vi + r~^V2, (9) 


where Vi = \ai{g{Y)} — (1 — p)E[\Six{g{Y) \ X} | (5 = 0], and V 2 = — 

2 • T n —1 • 

mo^i. 

The first term, n~^Vi, is the variance of the sample mean of g{yi) — (1 — &i){g{yi) — 
m{xi] 0)}. To estimate this term, consider Wm = i^ 


Cm = 


n‘^{M - 1 ) 


M n ( 1 ^ 




k=li=r+l 


k=l 


( 10 ) 
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We have E{Wm] = n-^wa.T{g{Y)] and E{Cm) = n-^{l - p)E[Yai{g{Y) \ X] \ 5 = ^]. 
Therefore, the first term n~^Vi ean be estimated by Wm = Wm — Cm- By the strong law of 
large numbers, pr(l^M > 0) —> 1 as n —)■ oo. 

The seeond term, r“^V 2 , refleets the variability assoeiated with the estimated value of 
9 instead of the true value 6 in the imputed values. To estimate this term, we use over¬ 
imputation in the sense that the imputation is earried out not only for the units with miss¬ 
ing outeomes, but also for the units with observed outeomes. Over-imputation has been 
used in model diagnosties for multiple imputation (Honaker et ah, 2010; Blaekwell et ah. 


2015). Let = g{y*^'"’) - M ^ Ez=i for i = l,...,n and 




M 








k = 1,... , M. De- 
and 


fine Dm,„ = (M - l)-‘ E?.i < 

DM.r = (M - l)-‘ Efc,("-' EEi - (M - 1)-' Efli «-"EE,«''‘'T. The key 
insight is based on the following observations: E{DM,n) — r ^rfiJZg ^rhe and E{DM,r) — 
r~^p‘^mj1710^1, therefore, the seeond term of (9) ean be estimated by Dm = DM,n — DM,r- 
Combining the estimators of the two terms in (9), we have the new multiple imputation varianee 
estimator, given in the following theorem. 


Theorem 2. Under the assumptions of Theorem 1, the new multiple imputation variance 
estimator is 


Vmi = Wm + Dm + M ^Bm, ( 11 ) 

where Wm = Wm — Cm, with Cm defined in (10) and Bm being the usual between-imputation 
variance in (\). Vmi B asymptotically unbiased for estimating the variance of the multiple impu¬ 
tation estimator in (2) as n ^ oo. 

Remark 3. To aeeount for the uneertainty in the varianee estimator with a small to moder¬ 
ate imputation size, a 100(1 — a)% interval estimate for g is ?7 mi =t fdf,i-a/ 2 \/^Mi> where df 
is an approximate number of degrees of freedom based on Satterthwaite’s method (1946) given 
in supplementary material. From simulation studies, we find that using df = M — 1 gives sim¬ 
ilar satisfaetory results as using the formula we provided. As a praetieal matter, df = M — lis 
preferred. 

Remark A. The proposed varianee estimator in (11) is also asymptotieally unbiased when f)n 
is the maximum likelihood estimator oi g = E{g{Y)} (see supplementary material for proof). 
Therefore, the proposed varianee estimator is applieable regardless of whether the maximum 
likelihood estimator or the method of moments estimator is used for the eomplete-sample es¬ 
timator. The priee we pay for the better performanee of our varianee estimator is an inerease 
in eomputational eomplexity and data storage spaee, whieh requires M -\-l datasets, with M 
of them ineluding the over-imputations and the last one eontaining the original observed data. 
However, when one’s eoneern is with valid inferenee of multiple imputation, as in this paper, 
our proposed varianee estimator based on over-imputation is preferred over that of Rubin’s. In 
addition, given over-imputations, the subsequent inferenee does not require the knowledge of 
the imputation models. This is important beeause data analysts typieally do not have aeeess to 
all the information that the imputers used for imputation. Our study would promote the use of 
over-imputation at the time of imputation, whieh not only allows the imputers to assess the ade- 
quaey of the imputation models, but also enables the analysts to earry out valid inferenee without 
knowledge of the imputation models. 
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5. Simulation Study 


To test our theory, we conduct two limited simulation studies. In the first simulation, 5,000 
Monte Carlo samples of size n = 2 ,000 are independently generated from Yi = 13Xi + e*, where 
(3 = 0.1, Xi ~ exp(l) and e* ~ N{Q, a‘1) with Cg = 0.5. In the sample, we assume that Xi is 
fully observed, but Yi is not. Let 6i be the response indicator of yi and di ~ Bernoulli (pi), 
where pi = 1/{1 + exp(—(^o ~ We consider two scenarios: (i) {(j)o,(j)i) = (—1.5,2) 

and (ii) = (3,-3), with the average response rate about 0.6. The parameters of in¬ 

terest are rji = E{Y) and 772 = pr(y < 0.15). For multiple imputation, M = 500 imputed 
values are independently generated from the linear regression model using the Bayesian re¬ 
gression imputation procedure discussed in Schenker & Welsh (1998), where (3 and cig 
are treated as independent with prior density proportional to a~‘^. In each imputed dataset, 
we adopt the following complete-sample point estimators and variance estimators: f)i „ = 

Ta=i Vi’ V 2 ,n = n-^ Ya=i Hvi < 0-15), V{rji^n) = n-^{n - 1)"^ T,i=iiyi “ ??i,n)^ and 
V{fj 2 ,n) = {n — l)“^ 772 ,n(l ” ?? 2 ,n)- The relative bias of the variance estimator is calculated as 
{£'(14ti) — var(f)Mi)}/var(?)Mi) x 100%. The 100(1 — a)% confidence infervals are calculafed 
as (PMI - U,i-a/ 2 V^Mhmi + t^,i-a/ 2 V^Mi), where t^^i_al 2 is the 100(1 - a/2)% quan- 
file of fhe t disfribufion wifh u degrees of freedom. For Rubin’s mefhod, v = viU 2 /{ui -|- ^ 2 ) 
wifh Vi = {M - 1)A“2, 122 = (t'com + l)(t'com + 3)“Vcom(l “ A), z/com = u - 3, and A = 
(1 -|- M~^)Bm/{Wm + (1 + M~^)Bm} (Barnard & Rubin, 1999). In our new mefhod, 12 = 
M — 1. The coverage is calculafed as fhe percenfage of Monfe Carlo samples where fhe esfimafe 
falls wifhin fhe confidence inferval. 

From Table 1, for 771 = E{Y), under scenario (i), fhe relative bias of Rubin’s variance estima¬ 
tor is 96.8%, which is consisfenf wifh our resulf in (7) wifh Ei{X'^) — Eq{X)'^E i{X) > 0, 
where Ei{X‘^) = 3.38, Ei{X) = 1.45, and Eq{X) = 0.48. Under scenario (ii), fhe relative 
bias of Rubin’s variance estimator is —19.8%, which is consisfenf wifh our resulf in (7) wifh 
Ei{X^) - Eo{X)'^Ei{X) < 0, where Ei{X^) = 0.37, Ei{X) = 0.47, and Eo{X) = 1.73. 
The empirical coverage for Rubin’s mefhod can be over or below fhe nominal coverage due 
fo variance overesfimafion or underesfimafion. On fhe ofher hand, fhe new variance esfimafor is 
essentially unbiased for fhese scenarios. 

In fhe second simulafion, 5,000 Monfe Carlo samples of size n = 200 are independenfly 
generated from T) = /3o + (3iXi -|- Cj, where (3 = (/3o, /3i) = (3, —1), Xi ~ N{2, 1) and ~ 
N (0, fjg) wifh cTg = 1. The parameters of inferesl are rji = E{Y) and 772 = pr(y < 1). We con¬ 
sider fwo differenl factors for simulation. One is fhe response mechanism: missing completely 
af random and missing af random. For missing completely af random, 5i ~ Bernoulli(0.6). For 
missing af random, 5i ~ Bernoulli(pi), where p* = 1/{1 -|- exp(—())o — ^iXj)} and (^o, <(>i) = 
(0.28,0.1) wifh fhe average response rate abouf 0.6. The ofher factor is fhe size of multiple 
impufafion, wifh fwo levels M = 10 and M = 30. 

From Table 2, regarding fhe relative bias, Rubin’s variance esfimafor is unbiased for 771 = 
E{Y), wifh absolute relafive bias of less fhan 1%, and our new variance esfimafor is comparable 
wifh Rubin’s variance esfimafor wifh absolute relative bias of less fhan 1.68%. Rubin’s variance 
esfimafor is biased upward for 772 = pr(y < 1), wifh absolute relafive bias as high as 24%; 
whereas our new variance esfimafor reduces absolufe relafive bias fo less fhan 1.74%. Regarding 
confidence inferval esfimafes, for 771 = E{Y), fhe confidence inferval calculafed from our new 
mefhod is slighfly wider fhan fhaf from Rubin’s mefhod, because our new mefhod uses a smaller 
number of degrees of freedom in fhe t disfribufion. However, for 772 = pr(y < 1), fhe confidence 
inferval calculated from our new mefhod is narrower fhan fhaf from Rubin’s mefhod even wifh a 
smaller number of degrees of freedom in fhe t disfribufion, due fo fhe overesfimafion in Rubin’s 
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Table 1. Relative biases of two variance estimators and mean width and coverages of two 
interval estimates under two scenarios in simulation one 
Relative bias Mean Width Mean Width Coverage Coverage 




(%) 


for 90% C.I. 

for 95% C.I. 

for 90% C.I. 

for 95% C.I. 

Scenario 


Rubin 

New 

Rubin 

New 

Rubin 

New 

Rubin 

New 

Rubin 

New 

1 


96.8 

0.7 

0.032 

0.023 

0.038 

0.027 

0.98 

0.90 

0.99 

0.95 


772 

123.7 

2.9 

0.022 

0.015 

0.027 

0.018 

0.98 

0.91 

1.00 

0.95 

2 


-19.8 

0.4 

0.051 

0.058 

0.061 

0.069 

0.85 

0.90 

0.91 

0.95 


V2 

-9.6 

-0.4 

0.031 

0.033 

0.037 

0.039 

0.87 

0.90 

0.93 

0.95 


C.I., confidence interval; rji = E{Y)\ 772 = vr{Y < 0.15); Rubin/New, Rubin’s/New variance estimator. 


Table 2. Relative biases of two variance estimators and mean width and coverages of 
two interval estimates under two scenarios of missingness in simulation two 
Relative Bias Mean Width Mean Width Coverage Coverage 




(%) 


for 90% C.I. 

for 95% C.I. 

for 90% C.I. 

for 95% C.I. 


M 

Rubin 

New 

Rubin 

New 

Rubin 

New 

Rubin 

New 

Rubin 

New 





Missing completely at random 





771 

10 

-0.9 

-1.58 

0.20 

0.211 

0.24 

0.25 

0.90 

0.90 

0.95 

0.95 


30 

-0.6 

-1.68 

0.192 

0.196 

0.230 

0.235 

0.90 

0.90 

0.95 

0.95 

V2 

10 

22.7 

-1.14 

0.069 

0.067 

0.083 

0.083 

0.95 

0.90 

0.98 

0.95 


30 

23.8 

-1.23 

0.068 

0.062 

0.082 

0.075 

0.94 

0.90 

0.98 

0.95 






Missing at random 





VI 

10 

-1.0 

-1.48 

0.19 

0.207 

0.23 

0.25 

0.90 

0.90 

0.95 

0.95 


30 

-0.9 

-1.59 

0.19 

0.192 

0.231 

0.23 

0.90 

0.90 

0.95 

0.95 

V2 

10 

20.7 

-1.64 

0.068 

0.066 

0.081 

0.081 

0.94 

0.90 

0.98 

0.95 


30 

21.5 

-1.74 

0.067 

0.061 

0.074 

0.071 

0.94 

0.90 

0.98 

0.95 


C.I., confidence interval; rji = E{Y); 772 = pr(T < 1); Rubin/New, Rubin’s/New variance estimator. 


method. Rubin’s method provides good empirical coverage for rji = E{Y) in the sense that 
the empirical coverage is close to the nominal coverage; however, the empirical coverage for 
772 = pr(y < 1) reaches to 95% for 90% confidence intervals, and 98% for 95% confidence 
infervals, due fo variance overesfimafion. In confrasl, our new mefhod provides more accurafe 
coverage of confidence inferval for bofh rji = E(Y) and r ]2 = pr{Y < 1) af 90% and 95% levels. 


6. Discussion 

Our method can be extended to a more general class of parameters obtained from estimating 
equations. Let tj be defined as a solufion fo fhe esfimafing equation YH=i Xi,yi) = 0. Ex¬ 
amples of r] include mean of y, proporfion of y less fhan q, pfh quanfile, regression coefficienls, 
and domain means. A similar approach can be used fo characferize fhe bias of Rubin’s variance 
esfimafor and fo develop a bias-correcfed variance estimator. 

Another extension would be developing unbiased variance estimation for the vector case of rj 
with q > 1 components. As in the scalar case, we can construct the multivariate analogues of the 
multiple imputation estimator and the variance estimator; however, finding an adequafe reference 
disfribufion for fhe sfafisfic (t^mi — (tiMi — v)/q more subfle in fhe vector case fhan 

in fhe scalar case. One pofenfial solufion is fo make a simplifying assumption fhaf fhe fraction 
of missing informafion is equal for all fhe componenfs of 77 , as discussed in Xie & Meng (2014) 
and Li el al. (1994). 
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